The search functionality is under construction.

Keyword Search Result

[Keyword] low power(376hit)

121-140hit(376hit)

  • Low-Power Embedded Processor Design Using Branch Direction

    Gi-Ho PARK  Jung-Wook PARK  Gunok JUNG  Shin-Dug KIM  

     
    LETTER-High-Level Synthesis and System-Level Design

      Vol:
    E92-A No:12
      Page(s):
    3180-3181

    This paper presents a wordline gating logic for reducing unnecessary BTB accesses. Partial bit of the branch predictor was simultaneously recorded in the middle of BTB to prevent further SRAM operation. Experimental results with embedded applications showed that the proposed mechanism reduces around 38% of BTB power consumption.

  • An Approach for Reducing Leakage Current Variation due to Manufacturing Variability

    Tsuyoshi SAKATA  Takaaki OKUMURA  Atsushi KUROKAWA  Hidenari NAKASHIMA  Hiroo MASUDA  Takashi SATO  Masanori HASHIMOTO  Koutaro HACHIYA  Katsuhiro FURUKAWA  Masakazu TANAKA  Hiroshi TAKAFUJI  Toshiki KANAMOTO  

     
    PAPER-Device and Circuit Modeling and Analysis

      Vol:
    E92-A No:12
      Page(s):
    3016-3023

    Leakage current is an important qualitative metric of LSI (Large Scale Integrated circuit). In this paper, we focus on reduction of leakage current variation under the process variation. Firstly, we derive a set of quadratic equations to evaluate delay and leakage current under the process variation. Using these equations, we discuss the cases of varying leakage current without degrading delay distribution and propose a procedure to reduce the leakage current variations. From the experiments, we show the proposed method effectively reduces the leakage current variation up to 50% at 90 percentile point of the distribution compared with the conventional design approach.

  • Design of Voltage-Mode MAX-MIN Circuits with Low Area and Low Power Consumption

    Mohammad SOLEIMANI  Abdollah KHOEI  Khayrollah HADIDI  Vahid Fagih DINAVARI  

     
    PAPER-Device and Circuit Modeling and Analysis

      Vol:
    E92-A No:12
      Page(s):
    3044-3051

    In this paper, new structure of Voltage-Mode MAX-MIN circuit are presented for nonlinear systems, fuzzy applications, neural network and etc. A differential pair with improved cascode current mirror is used to choose the desired input. The advantages of the proposed structure are high operating frequency, high precision, low power consumption, low area and simple expansion for multiple inputs by adding only three transistors for each extra input. The proposed circuit which is simulated by HSPICE in 0.35 µm CMOS process shows the total power consumption of 85 µW in 5 MHz operating frequency from a single 3.3-V supply. Also, the total area of the proposed circuit is about 420 µm2 for two input voltages, and would be negligibly increased for each extra input.

  • Ultra Low Power Delay Element with Post-Chip Adjustable Ability

    Jung-Lin YANG  Chih-Wei CHAO  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E92-A No:12
      Page(s):
    3381-3389

    Our paper proposes a low power delay element with many other valuable characteristics for asynchronous circuits in the bundled-data implementation. Delay elements are frequently utilized to interact with asynchronous environment for revealing the current status of the bundled-data asynchronous circuits. Thus, a notable portion of the total energy is consumed by the delay elements for this kind of designs. Moreover, constructing a specific delay on a chip is a difficult task for recent CMOS technology. An extreme low power asymmetrical delay element with post-chip adjustment feature was developed mainly for solving these issues. Our initial intention was to develop a programmable delay element for asynchronous data path components. The proposed delay element is also suitable for many other applications requiring low power constraint. In addition to the programmability, the delay element also demonstrated efficiently characteristics such as good tolerance to process and temperature variations on the delay. Our delay element is equivalent to approximately the average power of a 4-stage inverter chain. A large delay can be obtained by cascaded scheme with nearly zero handshaking overhead. All arguments were cautiously verified by the post-layout simulation setup using TSMC 0.35 µm and 0.18 µm technologies under all extreme corners.

  • Energy-Efficient Pre-Execution Techniques in Two-Step Physical Register Deallocation

    Kazunaga HYODO  Kengo IWAMOTO  Hideki ANDO  

     
    PAPER-Computer Systems

      Vol:
    E92-D No:11
      Page(s):
    2186-2195

    Instruction pre-execution is an effective way to prefetch data. We previously proposed an instruction pre-execution scheme, which we call two-step physical register deallocation (TSD). The TSD realizes pre-execution by exploiting the difference between the amount of instruction-level parallelism available with an unlimited number of physical registers and that available with an actual number of physical registers. Although previous TSD study has successfully improved performance, it still has an inefficient energy consumption. This is because attempts are made for instructions to be pre-executed as much as possible, independently of whether or not they can significantly contribute to load latency reduction, allowing for maximal performance improvement. This paper presents a scheme that improves the energy efficiency of the TSD by pre-executing only those instructions that have great benefit. Our evaluation results using the SPECfp2000 benchmark show that our scheme reduces the dynamic pre-executed instruction count by 76%, compared with the original scheme. This reduction saves 7% energy consumption of the execution core with 2% overhead. Performance degrades by 2%, compared with that of the original scheme, but is still 15% higher than that of the normal processor without the TSD.

  • Power Minimization for Dual- and Triple-Supply Digital Circuits via Integer Linear Programming

    Ki-Yong AHN  Chong-Min KYUNG  

     
    PAPER-VLSI Design Technology and CAD

      Vol:
    E92-A No:9
      Page(s):
    2318-2325

    This paper proposes an Integer Linear Programming (ILP)-based power minimization method by partitioning into regions, first, with three different VDD's(PM3V), and, secondly, with two different VDD's(PM2V). To reduce the solving time of triple-VDD case (PM3V), we also proposed a partitioned ILP method(p-PM3V). The proposed method provides 29% power saving on the average in the case of triple-VDD compared to the case of single VDD. Power reduction of PM3V compared to Clustered Voltage Scaling (CVS) was about 18%. Compared to the unpartitioned ILP formulation(PM3V), the partitioned ILP method(p-PM3V) reduced the total solution time by 46% at the cost of additional power consumption within 1.3%.

  • A Low-Power High Accuracy Over Current Protection Circuit for Low Dropout Regulator

    Socheat HENG  Cong-Kha PHAM  

     
    PAPER-Electronic Circuits

      Vol:
    E92-C No:9
      Page(s):
    1208-1214

    In this paper, a low power current protection circuit implemented in a low dropout regulator (LDO) is presented. The proposed circuit, designed in a 0.35 µm CMOS process, provides a precise limiting current as well as holding current with low dependency on both supply voltage and regulator output voltage. The experimental results showed that the proposed circuit is operable in the regulator output voltage range from VOUT=1.2 V to VOUT=3.6 V and supply voltage range from VDD=VOUT+0.5 V to VDD=5.6 V. Since the proposed circuit is composed of few simple basic circuits such as a comparator and a Schmitt Trigger, it has a low current consumption of less than ISS=0.82 µA at a load current of ILOAD=200 mA. This makes the circuit suitable for low power and low voltage LDO design.

  • A 0.31 pJ/Conversion-Step 12-Bit 100 MS/s 0.13 µm CMOS A/D Converter for 3G Communication Systems

    Young-Ju KIM  Kyung-Hoon LEE  Myung-Hwan LEE  Seung-Hoon LEE  

     
    PAPER-Electronic Circuits

      Vol:
    E92-C No:9
      Page(s):
    1194-1200

    This work describes a 12-bit 100 MS/s 0.13 µm CMOS ADC for 3G wireless communication systems such as two-carrier W-CDMA applications. The proposed ADC employs a four-step pipeline architecture to optimize power consumption and chip area at the target resolution and sampling rate. Area-efficient gate-bootstrapped sampling switches of the input SHA maintain high signal linearity over the Nyquist rate even at a 1.0 V supply. The cascode compensation using a low-impedance feedback path in two-stage amplifiers of the SHA and MDACs achieves the required conversion speed and phase margin with less power consumption and area compared to the Miller compensation. A low-glitch dynamic latch in the sub-ranging flash ADCs reduces kickback noise referred to the input of comparator by isolating the pre-amplifier from the regeneration latch output. The proposed on-chip current and voltage references are based on triple negative TC circuits. The prototype ADC in a 0.13 µm 1P8M CMOS technology demonstrates the measured DNL and INL within 0.38LSB and 0.96LSB at 12-bit, respectively. The ADC shows a maximum SNDR and SFDR of 64.5 dB and 78.0 dB at 100 MS/s, respectively. The ADC with an active die area of 1.22 mm2 consumes 42.0 mW at 100 MS/s and a 1.2 V supply, corresponding to a figure-of-merit of 0.31 pJ/conversion-step.

  • Architectural Exploration and Design of Time-Interleaved SAR Arrays for Low-Power and High Speed A/D Converters

    Sergio SAPONARA  Pierluigi NUZZO  Claudio NANI  Geert VAN DER PLAS  Luca FANUCCI  

     
    PAPER

      Vol:
    E92-C No:6
      Page(s):
    843-851

    Time-interleaved (TI) analog-to-digital converters (ADCs) are frequently advocated as a power-efficient solution to realize the high sampling rates required in single-chip transceivers for the emerging communication schemes: ultra-wideband, fast serial links, cognitive-radio and software-defined radio. However, the combined effects of multiple distortion sources due to channel mismatches (bandwidth, offset, gain and timing) severely affect system performance and power consumption of a TI ADC and need to be accounted for since the earlier design phases. In this paper, system-level design of TI ADCs is addressed through a platform-based methodology, enabling effective investigation of different speed/resolution scenarios as well as the impact of parallelism on accuracy, yield, sampling-rate, area and power consumption. Design space exploration of a TI successive approximation ADC is performed top-down via Monte Carlo simulations, by exploiting behavioral models built bottom-up after characterizing feasible implementations of the main building blocks in a 90-nm 1-V CMOS process. As a result, two implementations of the TI ADC are proposed that are capable to provide an outstanding figure-of-merit below 0.15 pJ/conversion-step.

  • Design of 1 V Operating Fully Differential OTA Using NMOS Inverters in 0.18 µm CMOS Technology

    Atsushi TANAKA  Hiroshi TANIMOTO  

     
    PAPER

      Vol:
    E92-C No:6
      Page(s):
    822-827

    This paper presents a 1 V operating fully differential OTA using NMOS inverters in place of the traditional differential pair. To obtain high gain, a two-stage configuration is used in which the first stage has feedforward paths to cancel the common-mode signal, and the second stage has common-mode feedback paths to stabilize the output common-mode voltage. The proposed OTA was fabricated by an 0.18 µm CMOS technology. Measured gain is 40 dB and GBW is 10 MHz, in addition to differential output voltage swing of 1.8 Vp - p. It is confirmed that the proposed OTA can operate from 1 V power supply and has very large output swing capability even in a 1 V operation. The proposed OTA configuration contributes to a solution to the low power supply voltage issue in scaled CMOS analog circuits.

  • A Low Power Reconfigurable Channel Filter Using Multi-Band and Masking Architecture for Channel Adaptation in Cognitive Radio

    K. G. SMITHA  A. P. VINOD  

     
    PAPER-Digital Signal Processing

      Vol:
    E92-A No:6
      Page(s):
    1424-1432

    Cognitive radio (CR) is an adaptive spectrum sharing paradigm targeted to provide opportunistic spectrum access to secondary users for whom the frequency bands have not been licensed. The key tasks in a CR are to sense the spectral environment over a wide frequency band and allow unlicensed secondary users (CR users) to dynamically transmit/receive data over frequency bands unutilized by licensed primary users. Thus the CR transceiver should dynamically adapt its channel (frequency band) in response to the time-varying frequencies of wideband signal for seamless communication. In this paper, we present a low complexity reconfigurable filter architecture based on multi-band filtering and frequency masking techniques for dynamic channel adaptation in CR terminal. The proposed multi-standard architecture is capable of adapting to channels having different bandwidths corresponding to the channel spacing of time-varying channels. Design examples show that proposed architecture offers 12.2% power reduction and 26.5% average gate count reduction over conventional Per-Channel based architecture.

  • Design of Low Power QPP Interleave Address Generator Using the Periodicity of QPP

    Won-Ho LEE  Chong Suck RIM  

     
    LETTER-VLSI Design Technology and CAD

      Vol:
    E92-A No:6
      Page(s):
    1538-1540

    This paper presents two power-saving designs for Quadratic Polynomial Permutation (QPP) interleave address generator of which interleave length K is fixed and unfixed, respectively. These designs are based on our observation that the quadratic term f2x2%K of f(x) = (f1x+f2x2)%K, which is the QPP address generating function, has a short period and is symmetric within the period. Power consumption is reduced by 27.4% in the design with fixed-K and 5.4% in the design with unfixed-K on the average for various values of K, when compared with existing designs.

  • Reducing On-Chip DRAM Energy via Data Transfer Size Optimization

    Takatsugu ONO  Koji INOUE  Kazuaki MURAKAMI  Kenji YOSHIDA  

     
    PAPER

      Vol:
    E92-C No:4
      Page(s):
    433-443

    This paper proposes a software-controllable variable line-size (SC-VLS) cache architecture for low power embedded systems. High bandwidth between logic and a DRAM is realized by means of advanced integrated technology. System-in-Silicon is one of the architectural frameworks to realize the high bandwidth. An ASIC and a specific SRAM are mounted onto a silicon interposer. Each chip is connected to the silicon interposer by eutectic solder bumps. In the framework, it is important to reduce the DRAM energy consumption. The specific DRAM needs a small cache memory to improve the performance. We exploit the cache to reduce the DRAM energy consumption. During application program executions, an adequate cache line size which produces the lowest cache miss ratio is varied because the amount of spatial locality of memory references changes. If we employ a large cache line size, we can expect the effect of prefetching. However, the DRAM energy consumption is larger than a small line size because of the huge number of banks are accessed. The SC-VLS cache is able to change a line size to an adequate one at runtime with a small area and power overheads. We analyze the adequate line size and insert line size change instructions at the beginning of each function of a target program before executing the program. In our evaluation, it is observed that the SC-VLS cache reduces the DRAM energy consumption up to 88%, compared to a conventional cache with fixed 256 B lines.

  • A Way Enabling Mechanism Based on the Branch Prediction Information for Low Power Instruction Cache

    Gi-Ho PARK  Jung-Wook PARK  Hoi-Jin LEE  Gunok JUNG  Sung-Bae PARK  Shin-Dug KIM  

     
    LETTER

      Vol:
    E92-C No:4
      Page(s):
    517-521

    This paper presents a cache way enabling mechanism using branch target addresses. This mechanism uses branch prediction information to avoid the power consumption due to unnecessary cache way access by enabling only the cache way(s) that should be accessed. The proposed cache way enabling mechanism reduces the power consumption of the instruction cache by 63% without any performance degradation of the processor. An ARM1136 processor simulator and the Synopsys PrimeTime are used to perform the performance/power simulation and static timing analysis of the proposed mechanisms respectively.

  • A Reference Voltage Buffer with Settling Boost Technique for a 12 bit 18 MHz Multibit/Stage Pipelined A/D Converter

    Shunsuke OKURA  Tetsuro OKURA  Toru IDO  Kenji TANIGUCHI  

     
    PAPER

      Vol:
    E92-A No:2
      Page(s):
    367-373

    A reference voltage buffer for a multibit/stage pipelined ADC is described, where a settling boost technique is used to improve the settling response of the pipelined stages. A 12 bit 18 MHz pipelined ADC with the buffer is designed and simulated based on a 0.35 µm CMOS process. According to simulation results, the power consumed by the reference voltage buffer is reduced by 33% compared to that without the settling boost technique.

  • Evaluation of Interconnect-Complexity-Aware Low-Power VLSI Design Using Multiple Supply and Threshold Voltages

    Hasitha Muthumala WAIDYASOORIYA  Masanori HARIYAMA  Michitaka KAMEYAMA  

     
    PAPER-High-Level Synthesis and System-Level Design

      Vol:
    E91-A No:12
      Page(s):
    3596-3606

    This paper presents a high-level synthesis approach to minimize the total power consumption in behavioral synthesis under time and area constraints. The proposed method has two stages, functional unit (FU) energy optimization and interconnect energy optimization. In the first stage, active and inactive energies of the FUs are optimized using a multiple supply and threshold voltage scheme. Genetic algorithm (GA) based simultaneous assignment of supply and threshold voltages and module selection is proposed. The proposed GA based searching method can be used in large size problems to find a near-optimal solution in a reasonable time. In the second stage, interconnects are simplified by increasing their sharing. This is done by exploiting similar data transfer patterns among FUs. The proposed method is evaluated for several benchmarks under 90 nm CMOS technology. The experimental results show that more than 40% of energy savings can be achieved by our proposed method.

  • Cross-Layer Design for Low-Power Wireless Sensor Node Using Wave Clock

    Takashi TAKEUCHI  Yu OTAKE  Masumi ICHIEN  Akihiro GION  Hiroshi KAWAGUCHI  Chikara OHTA  Masahiko YOSHIMOTO  

     
    PAPER

      Vol:
    E91-B No:11
      Page(s):
    3480-3488

    We propose Isochronous-MAC (I-MAC) using the Long-Wave Standard Time Code (so called "wave clock"), and introduce cross-layer design for a low-power wireless sensor node with I-MAC. I-MAC has a periodic wakeup time synchronized with the actual time, and thus we take the wave clock. However, a frequency of a crystal oscillator varies along with temperature, which incurs a time difference among nodes. We present a time correction algorithm to address this problem, and shorten the time difference. Thereby, the preamble length in I-MAC can be minimized, which saves communication power. For further power reduction, a low-power crystal oscillator is also proposed, as a physical-layer design. We implemented I-MAC on an off-the-shelf sensor node to estimate the power saving, and verified that the proposed cross-layer design reduces 81% of the total power, compared to Low Power Listening.

  • Low Power Realization and Synthesis of Higher-Order FIR Filters Using an Improved Common Subexpression Elimination Method

    K.G. SMITHA  A.P. VINOD  

     
    PAPER-Digital Signal Processing

      Vol:
    E91-A No:11
      Page(s):
    3282-3292

    The complexity of Finite Impulse Response (FIR) filters is mainly dominated by the number of adders (subtractors) used to implement the coefficient multipliers. It is well known that Common Subexpression Elimination (CSE) method based on Canonic Signed Digit (CSD) representation considerably reduces the number of adders in coefficient multipliers. Recently, a binary-based CSE (BSE) technique was proposed, which produced better reduction of adders compared to the CSD-based CSE. In this paper, we propose a new 4-bit binary representation-based CSE (BCSE-4) method which employs 4-bit Common Subexpressions (CSs) for implementing higher order low-power FIR filters. The proposed BCSE-4 offers better reduction of adders by eliminating the redundant 4-bit CSs that exist in the binary representation of filter coefficients. The reduction of adders is achieved with a small increase in critical path length of filter coefficient multipliers. Design examples show that our BCSE-4 gives an average power consumption reduction of 5.2% and 6.1% over the best known CSE method (BSE, NR-SCSE) respectively, when synthesized with TSMC-0.18 µm technology. We show that our BCSE-4 offers an overall adder reduction of 6.5% compared to BSE without any increase in critical path length of filter coefficient multipliers.

  • Power and Skew Aware Point Diffusion Clock Network

    Gunok JUNG  Chunghee KIM  Kyoungkuk CHAE  Giho PARK  Sung Bae PARK  

     
    LETTER-Integrated Electronics

      Vol:
    E91-C No:11
      Page(s):
    1832-1834

    This letter presents point diffusion clock network (PDCN) with local clock tree synthesis (CTS) scheme. The clock network is implemented with ten times wider metal line space than typical mesh networks for low power and utilized to nine times smaller area CTS execution for minimized clock skew amount. The measurement results show that skew amount of PDCN with local CTS is reduced to 36% and latency is shrunk to 45% of the amount in a 4.81 mm2 CortexA-8 core with 65 nm Samsung process.

  • 4.8 GHz CMOS Frequency Multiplier Using Subharmonic Pulse-Injection Locking for Spurious Suppression

    Kyoya TAKANO  Mizuki MOTOYOSHI  Minoru FUJISHIMA  

     
    PAPER

      Vol:
    E91-C No:11
      Page(s):
    1738-1743

    To realize low-power wireless transceivers, it is necessary to improve the performance of frequency synthesizers, which are typically frequency multipliers composed of a phase-locked loop (PLL). However, PLLs generally consume a large amount of power and occupy a large area. To improve the frequency multiplier, we propose a pulse-injection-locked frequency multiplier (PILFM), where a spurious signal is suppressed using a pulse input signal. An injection-locked oscillator (ILO) in a PILFM was fabricated by a 0.18 µm 1P5M CMOS process. The core size is 10.8 µm10.5 µm. The power consumption of the ILO is 9.6 µW at 250 MHz, 255 µW at 2.4 GHz and 1.47 mW at 4.8 GHz. The phase noise is -105 dBc/Hz at a 1 MHz offset.

121-140hit(376hit)